feat(ai): cost-aware routing + per-feature daily budget (Phase 12, AI-078 slice 4) #369
Merged
…e 12 slice 4)

Per-feature daily USD budgets with cost-aware enforcement.

- RollingSpendTracker (singleton, Core ISpendTracker): per-feature UTC-daily buckets, lock-free long micro-dollars (Interlocked.Add, Math.Round). Day rollover via TimeProvider. Lazy seed from llm_traces scaled by 1/sample-rate (traces are sampled; explain=0.1) so a mid-day restart doesn't reset to $0. Recording is in the gateway (unsampled, exactly once per call), not the sampled tracer.
- Gateway enforcement: spend >= DailyUsd -> mode fallback reroutes to a cheaper provider (e.g. free ollama); mode hardstop -> BudgetExceededException -> 429. NEVER breaks a live call (any failure -> true primary + log). Shadow unaffected (compares true primary); shadow spend not counted.
- 80% admin alert: edge-triggered, deduped per (feature, day), fire-and-forget via Resend (no-op if unset).
- Admin GET /budgets + Daily budgets section on Summary tab (reads tracker for budgeted features only).
- Budgets OFF by default (Ai:Budgets empty); no migration.

architect -> backend+frontend -> adversarial QA (SHIP; money-count/edge-alert/never-break verified, P2 fixed). 761 unit tests green; admin tsc+build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 12 (RLOps) — slice 4: cost-aware routing + daily budgets
Per-feature daily USD budgets — the DoD "cost-aware routing cuts spend" lever.
Spend tracking
RollingSpendTracker (singleton, Core ISpendTracker) accumulates per-feature spend in UTC-daily buckets as lock-free long micro-dollars (Interlocked.Add, Math.Round away-from-zero, so no truncation drift). Day rollover is lazy, via an injected TimeProvider. The first touch each day seeds from llm_traces scaled by 1/sample-rate (the tracer samples; explain=0.1 in prod, so the raw sum is 10× low), so a mid-day restart doesn't reset the day's spend to $0. Recording happens in the gateway (unsampled, exactly once per CompleteAsync/StreamAsync), not in the sampled tracer.
Enforcement (ModelGateway)
Spend ≥ DailyUsd → mode fallback reroutes to a cheaper provider key (e.g. free local ollama); mode hardstop throws BudgetExceededException → 429. A budget-logic failure can never break a live call: any tracker/config failure or unregistered fallback falls through to the true primary and logs. Shadow is unaffected (it still compares the TRUE primary, not the fallback), and shadow spend doesn't count against the primary budget.
80% alert
Edge-triggered (fires once on crossing 0.8×budget, not per call), deduped per (feature, day), fire-and-forget via ResendEmailService (no-op if Resend:AdminAlertEmail is empty); refires the next day.
Admin
GET /admin/ai-quality/budgets + a "Daily budgets" section on the Summary tab: per-feature today-spend vs cap, color-coded % bar, mode, "in fallback" badge.
Scope / safety
Budgets are OFF by default (Ai:Budgets empty) → zero behavior change until configured (Features:{feature}:{DailyUsd,Fallback,Mode}). No migration (in-memory + config). Multi-replica caveat: counters are per-replica (the lazy DB seed bounds drift; a periodic re-seed is a named follow-up); prod is single-replica.
Verification
architect → backend + frontend (parallel, locked contract) → adversarial QA (SHIP): money-counting (round-once, record-exactly-once, shadow-excluded, seed-divide-not-multiply), edge-triggered alert (hammer-tested, no email storm), and never-break paths all verified; P2 (read-only endpoint seeding unbudgeted features) fixed. 761 unit tests green; admin tsc + build clean. Browser-check deferred to owner (needs a budget configured + admin JWT).
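
Illustrative sketch (spend tracking). The money-counting rules described under Spend tracking (integer micro-dollars rounded once, UTC-day buckets, and a first-touch seed that divides the sampled sum by the sample rate) can be approximated in a few lines of Python. This is not the PR's C# RollingSpendTracker: the names SpendTracker and to_micros and the seed callback signature are hypothetical, and the sketch uses no synchronization where the PR relies on Interlocked.Add.

```python
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP

MICROS_PER_USD = 1_000_000


def to_micros(usd):
    """Convert USD to integer micro-dollars, rounding ties away from zero."""
    scaled = Decimal(str(usd)) * MICROS_PER_USD
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


class SpendTracker:
    """Hypothetical single-threaded stand-in for a per-feature daily tracker."""

    def __init__(self, now=None, seed=None):
        # Injectable clock, playing the role of TimeProvider in the PR.
        self._now = now or (lambda: datetime.now(timezone.utc))
        # Assumed callback: seed(feature, day) -> (sampled_usd, sample_rate).
        self._seed = seed
        self._buckets = {}  # feature -> (utc_date, micros)

    def _today_micros(self, feature):
        today = self._now().date()
        day, micros = self._buckets.get(feature, (None, 0))
        if day != today:
            # Lazy rollover: the first touch of a new day starts a new bucket.
            micros = 0
            if self._seed is not None:
                sampled_usd, rate = self._seed(feature, today)
                # Divide by the sample rate: a 0.1 sample means the raw sum is 10x low.
                micros = to_micros(Decimal(str(sampled_usd)) / Decimal(str(rate)))
            self._buckets[feature] = (today, micros)
        return today, micros

    def record(self, feature, usd):
        """Add one call's cost (rounded once) and return the day's total in micros."""
        today, micros = self._today_micros(feature)
        micros += to_micros(usd)
        self._buckets[feature] = (today, micros)
        return micros

    def spent_usd(self, feature):
        return self._today_micros(feature)[1] / MICROS_PER_USD
```

Keeping the running total as an integer means repeated additions cannot accumulate floating-point error; rounding happens only once, when each call's cost is converted.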
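
Illustrative sketch (enforcement and 80% alert). The routing decision and the once-per-day alert described under Enforcement and 80% alert could look roughly like the following Python. Everything here is an assumption made for illustration: Budget, choose_provider, AlertGate, and the "primary-llm" key are hypothetical names, and the real ModelGateway may structure this quite differently. The sketch only mirrors the stated rules: fallback reroutes, hardstop raises, any tracker failure or unregistered fallback returns the true primary, and the alert fires at most once per (feature, day).

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Stand-in for BudgetExceededException (surfaced as HTTP 429 per the PR)."""


@dataclass(frozen=True)
class Budget:
    daily_usd: float
    mode: str           # "fallback" or "hardstop"
    fallback: str = ""  # provider key to reroute to, e.g. "ollama"


def choose_provider(primary, budget, spent_usd, registered, log=print):
    """Return the provider key to call; raises only for an exceeded hardstop budget."""
    if budget is None:  # budgets are off unless configured
        return primary
    try:
        over = spent_usd() >= budget.daily_usd
    except Exception as exc:  # a tracker failure must not break the live call
        log(f"budget check failed, using primary: {exc}")
        return primary
    if not over:
        return primary
    if budget.mode == "hardstop":
        raise BudgetExceeded(primary)
    if budget.fallback in registered:
        return budget.fallback
    log(f"fallback {budget.fallback!r} not registered, using primary")
    return primary


class AlertGate:
    """Fires at most once per (feature, day) once spend reaches the threshold."""

    def __init__(self, threshold=0.8):
        self._threshold = threshold
        self._fired = set()  # (feature, day) pairs that already alerted

    def should_alert(self, feature, day, spent_usd, daily_usd):
        key = (feature, day)
        if key in self._fired or spent_usd < self._threshold * daily_usd:
            return False
        self._fired.add(key)
        return True
```

Because the day is part of the dedupe key, a new day yields a fresh key, which matches the "refires next day" behavior without any explicit reset.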
🤖 Generated with Claude Code